This document contains the analysis and results for the event “A day in daylight”, in which people from around the world measured a complete day of light exposure on (and around) 22 September 2025.
Note
Note that this script is optimized to generate plot outputs and objects for use in a dashboard. The direct outputs of the script might therefore look distorted in places.
Importing data
We first set up all packages needed for the analysis.
library(LightLogR)
library(Hmisc)
Attaching package: 'Hmisc'
The following objects are masked from 'package:base':
format.pval, units
library(tidyverse)
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
✖ dplyr::src() masks Hmisc::src()
✖ dplyr::summarize() masks Hmisc::summarize()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(gt)
Attaching package: 'gt'
The following object is masked from 'package:Hmisc':
html
library(rlang)
Attaching package: 'rlang'
The following objects are masked from 'package:purrr':
%@%, flatten, flatten_chr, flatten_dbl, flatten_int, flatten_lgl,
flatten_raw, invoke, splice
library(gganimate)
library(ggdist)
Attaching package: 'ggdist'
The following object is masked from 'package:rlang':
ll
library(sf)
Linking to GEOS 3.13.0, GDAL 3.8.5, PROJ 9.5.1; sf_use_s2() is TRUE
library(rnaturalearth)
library(rnaturalearthdata)
Attaching package: 'rnaturalearthdata'
The following object is masked from 'package:rnaturalearth':
countries110
library(scales)
Attaching package: 'scales'
The following object is masked from 'package:purrr':
discard
The following object is masked from 'package:readr':
col_factor
Next we import the survey data. Data were collected with REDCap, and there is an import script to load the data in.
source("scripts/prep_survey_data.r")
Connecting light data with survey data
First, we collect a list of available data sets. As we need to compare them to the device IDs in the survey, we keep only the file names without path or extension.
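A minimal sketch of this step (the folder path and file extension are assumptions; the actual script may differ):

```r
# list all light logger files in the (assumed) data folder
files <- list.files("data/light", full.names = TRUE)

# strip directory and extension, leaving the bare file name
# for matching against the device ids recorded in the survey
file_ids <- tools::file_path_sans_ext(basename(files))
```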
Record ID 31 did not finish the post-survey, so we lack data on that device and consequently remove it. Furthermore, Record ID 30 only has data well outside the time frame of interest. Record ID 40 does not have sensible data and will also be removed; a dead battery is the likely cause.
We also have to clean up the city and country data, as well as latitude and longitude. We do this separately and load the data back in. The manual entries for locations had to be cleaned; this was done with OpenAI through an API key. The results were stored in the file data/cleaned/places.csv. Uncomment the code cell below to recreate the process. Details in the outcome may vary, however.
# library(ellmer)
# 
# data_devices_red <- 
#   data_devices |> 
#   select(record_id, city_country, latitude, longitude)
# 
# chat <- chat_openai("If there is more than one place specified, only use the first one. If latitude and longitude are misspecified, make a best guess based on city_country. Use IANA names for the time zone identifiers")
# 
# #reducing each line in a table to a single string
# data_devices_red <- 
#   data_devices_red |> 
#   pmap(~ paste(paste(names(data_devices_red), c(...), sep = ": "), collapse = ", "))
# 
# #creating an output structure
# type_place <- type_object(
#   record_id = type_string(),
#   city = type_string(),
#   country = type_string(),
#   latitude = type_number(),
#   longitude = type_number(),
#   tz_identifier = type_string(),
#   UTC_dev = type_number("deviation from UTC in hours, given the 22 September 2025")
# )
# 
# places <-
#   parallel_chat_structured(
#     chat,
#     data_devices_red,
#     type = type_place
#   )
# 
# write.csv(places, "data/cleaned/places.csv")
#read pre-cleaned data in
places <- read_csv("data/cleaned/places.csv")
New names:
• `` -> `...1`
Rows: 49 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): city, country, tz_identifier
dbl (5): ...1, record_id, latitude, longitude, UTC_dev
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
places <- 
  places |> 
  dplyr::mutate(record_id = as.character(record_id)) |> 
  select(-`...1`)

#merge data with main data
data_devices_cleaned <- 
  data_devices |> 
  select(-city_country, -latitude, -longitude) |> 
  mutate(record_id = as.character(record_id)) |> 
  left_join(places, by = "record_id") |> 
  mutate(
    city = case_match(
      city,
      "Tuebingen" ~ "Tübingen",
      "İzmir" ~ "Izmir",
      .default = city
    ),
    country = case_match(
      country,
      "The Netherlands" ~ "Netherlands",
      c("Turkiye", "Türkiye") ~ "Turkey",
      c("US", "United States", "USA") ~ "United States of America",
      "UK" ~ "United Kingdom",
      .default = country
    )
  )
First overview
The following code cells use the data imported so far to create some descriptive plots about the sample.
sex_lab <- primitive_bracket(
  key = key_range_manual( # <- positions + labels
    start = c(-7, 0.1),
    end = c(-0.1, 7), # -6 and +6 on the x-axis
    name = c("Males", "Females")
  ),
  position = "bottom" # draw it at the bottom of the panel
)
Next, we import the light data. There are two device types in use, ActLumus and ActTrust; we need to import them separately, as they use different import functions. A device_id with four digits indicates an ActLumus device, whereas seven digits indicate an ActTrust device. We add a column to the data indicating the type of device in use. We also make sure that the spelling matches the supported_devices() list from LightLogR. Then we construct file paths for all files.
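The digit-count rule can be sketched as follows (a toy stand-in; the actual column names are assumptions based on the description above):

```r
# toy stand-in: a real device_id has four digits (ActLumus) or seven (ActTrust)
device_id <- c("1234", "1234567")

# classify each device by the length of its id
device_type <- ifelse(nchar(device_id) == 4, "ActLumus", "ActTrust")
```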
data_devices_cleaned <- 
  data_devices_cleaned |> 
  dplyr::mutate(
    data = purrr::pmap(
      list(x = device_type, y = file_path, z = tz_identifier), 
      \(x, y, z) {
        import_Dataset(device = x, filename = y, tz = z, silent = TRUE)
      }
    )
  )
We end up with one dataset per row entry. As the two ActTrust files do not contain a melanopic EDI column, we will use the photopic illuminance column LIGHT instead. As only two participants are affected by this shortcoming, it will not unduly influence the results.
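The substitution for those two datasets can be sketched like this (the melanopic column name MEDI is an assumption about the import format):

```r
# toy ActTrust-style data: photopic illuminance (LIGHT) but no melanopic EDI
acttrust <- data.frame(LIGHT = c(120, 300))

# use photopic illuminance in place of melanopic EDI (MEDI) for these devices
acttrust$MEDI <- acttrust$LIGHT
```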
Further, the dataset from Malaysia had a device malfunction on 22 September and only recorded from 23 September onwards. As there are minimal differences between dates and very few datasets in that region, we will not dismiss that dataset but rather shift its data by one day. We also need to shift the data by another 8 hours backwards, as that dataset was stored in UTC time (for some reason).
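The combined shift can be sketched in base R (the record selection is omitted; the timestamp value is a toy example):

```r
# toy example: a sample recorded on 23 September, stored as UTC
dt <- as.POSIXct("2025-09-23 08:00:00", tz = "UTC")

# shift back by one day (device started a day late) plus eight hours (UTC storage)
dt_shifted <- dt - 24*3600 - 8*3600
```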
In this section we will prepare the light data through the following steps:
resampling data to 5 minute intervals
filling in missing data with explicit gaps
removing data that does not fall between 2025-09-21 10:00:00 UTC and 2025-09-23 12:00:00 UTC, which contains all times where 22 September occurs somewhere on the planet
data_devices <- 
  data_devices_cleaned |> 
  dplyr::mutate(
    data = purrr::map(data, \(x) {
      x |> 
        aggregate_Datetime("5 mins") |> #resample to 5 mins
        gap_handler(full.days = TRUE) |> #put in explicit gaps
        filter_Datetime(
          start = "2025-09-21 10:00:00",
          end = "2025-09-23 12:00:00",
          tz = "UTC"
        ) #cut out a section of data
    })
  )
Next, we add a secondary Datetime column that runs on UTC time.
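The idea behind the secondary column can be sketched in base R (lubridate::with_tz() does the same; the column setup here is a toy example):

```r
# local timestamp (Berlin is UTC+2 on 22 September 2025)
dt_local <- as.POSIXct("2025-09-22 12:00:00", tz = "Europe/Berlin")

# same instant, expressed in UTC; only the display time zone changes
dt_utc <- dt_local
attr(dt_utc, "tzone") <- "UTC"
```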
In this section we deal with the activity logs: first by filtering them out of the dataset, then selecting the relevant columns.
events <- 
  data |> 
  filter(redcap_repeat_instrument == "log_a_new_activity") |> 
  select(
    record_id, type.factor, social_context.factor, wear_activity.factor, 
    nonwear_activity.factor, nighttime.factor, setting_level01.factor, 
    setting_level02_indoors.factor, setting_level02_indoors_home.factor, 
    setting_level02_indoors_workingspace.factor, 
    setting_level02_indoors_healthfacility.factor, 
    setting_level02_indoors_learningfacility.factor, 
    setting_level02_indoors_leisurespace.factor, 
    setting_level02_indoors_retailfacility.factor, 
    setting_level02_mixed.factor, setting_level02_outdoors.factor, 
    lighting_scenario_daylight___1.factor, lighting_scenario_daylight___2.factor, 
    lighting_scenario_daylight___3.factor, lighting_scenario_daylight___4.factor, 
    lighting_scenario_3___1.factor, lighting_scenario_3___2.factor, 
    lighting_scenario_3___3.factor, 
    lighting_scenario_2___1.factor, lighting_scenario_2___2.factor, 
    lighting_scenario_2___3.factor, lighting_scenario_2___4.factor, 
    autonomy.factor, notes, startdate, enddate
  )

#adding labels to the factors
label(events$type.factor) = "Wear type: Are you wearing the light logger at the moment?"
label(events$social_context.factor) = "Are you alone or with others?"
label(events$wear_activity.factor) = "Wear activity"
label(events$nonwear_activity.factor) = "Non-wear activity"
label(events$nighttime.factor) = "Where was the light logger when you were asleep?"
label(events$setting_level01.factor) = "Select the setting"
label(events$setting_level02_indoors.factor) = "Indoors setting"
label(events$setting_level02_indoors_home.factor) = "Indoors setting (home)"
label(events$setting_level02_indoors_workingspace.factor) = "Indoors setting (working space)"
label(events$setting_level02_indoors_learningfacility.factor) = "Indoors setting (learning facility)"
label(events$setting_level02_indoors_retailfacility.factor) = "Indoors setting (retail facility)"
label(events$setting_level02_indoors_healthfacility.factor) = "Indoors setting (health facility)"
label(events$setting_level02_indoors_leisurespace.factor) = "Indoors setting (leisure space)"
label(events$setting_level02_outdoors.factor) = "Outdoors setting"
label(events$setting_level02_mixed.factor) = "Indoors-outdoors setting"
label(events$lighting_scenario_daylight___1.factor) = "Select lighting setting (daylight) (choice=Outdoors (direct sunlight))"
label(events$lighting_scenario_daylight___2.factor) = "Select lighting setting (daylight) (choice=Outdoors (in shade / cloudy))"
label(events$lighting_scenario_daylight___3.factor) = "Select lighting setting (daylight) (choice=Indoors (near window / exposed to daylight))"
label(events$lighting_scenario_daylight___4.factor) = "Select lighting setting (daylight) (choice=Indoors (away from window))"
label(events$lighting_scenario_3___1.factor) = "Select lighting setting (electric light) (choice=Lights are switched on)"
label(events$lighting_scenario_3___2.factor) = "Select lighting setting (electric light) (choice=Low-light or dimmed lights)"
label(events$lighting_scenario_3___3.factor) = "Select lighting setting (electric light) (choice=Completed darkness)"
label(events$lighting_scenario_2___1.factor) = "Select lighting setting (screen use) (choice=Smartphone)"
label(events$lighting_scenario_2___2.factor) = "Select lighting setting (screen use) (choice=Tablet)"
label(events$lighting_scenario_2___3.factor) = "Select lighting setting (screen use) (choice=Computer)"
label(events$lighting_scenario_2___4.factor) = "Select lighting setting (screen use) (choice=Television)"
label(events$autonomy.factor) = "Were the lighting conditions in this setting self-selected (i.e., you had control over lighting intensity, spectrum, or exposure)?"
Next, we condense columns that can be expressed as one. We also drop the .factor suffix, as all duplicate columns have now been removed. Finally, we simplify the entries.
events <- 
  events |> 
  rename_with(\(x) x |> str_remove(".factor")) |> #remove .factor extension
  dplyr::mutate(
    type = type |> 
      fct_relabel(\(x) str_remove(x, "-time| time| \\(not wearing light logger\\)")),
    across(
      c(wear_activity, setting_level01), 
      \(x) x |> fct_relabel(\(y) str_remove(y, " \\(.*\\)"))
    ),
    nonwear_activity = nonwear_activity |> 
      fct_recode(
        "Dark mobile" = "Left in a bag, or other mobile dark place",
        "Dark stationary" = "Left in a drawer or cabinet, or other stationary dark place",
        "Stationary" = "Left on a table or other surface with varying light exposure"
      ),
    nighttime = nighttime |> 
      fct_recode(
        "Upward" = "Facing upward on bedside table",
        "Downward" = "Facing downward on bedside table"
      ),
    across(
      c(setting_level02_indoors, setting_level02_outdoors), 
      \(x) x |> 
        fct_recode(
          "Leisure" = "Leisure space (sports, recreation, entertainment)",
          "Commercial" = "Retail, food or services facility",
          "Workplace" = "Working space",
          "Education" = "Learning facility",
          "Healthcare" = "Health facility"
        )
    ),
    setting_level01 = setting_level01 |> 
      fct_recode(
        "Mixed" = "Indoor-outdoor setting"
      ),
    autonomy = autonomy |> 
      fct_recode(
        Yes = "Yes, fully self-selected (e.g., adjusting lights at home or in a private office, moving to shaded area)",
        Partly = "Partly self-selected (e.g., some control such as opening blinds or switching a desk lamp, but not over main lighting)",
        No = "Not self-selected (e.g., public transport, shared office, classroom, hospital, airplane, etc.)",
        NULL = "Not applicable"
      )
  ) |> 
  dplyr::rename(setting_light = setting_level01)
In this step, we expand the light measurements with the event data. To this end, we need to specify start and end times for each log entry, and thus for each state.
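One way to sketch this matching is a non-equi join (dplyr ≥ 1.1); the toy tables and the light_data column names are assumptions for illustration, while startdate/enddate follow the events table above:

```r
library(dplyr)

# hypothetical light measurements and one event episode
light_data <- data.frame(
  record_id = "1",
  Datetime  = as.POSIXct("2025-09-22 10:00:00", tz = "UTC")
)
events_demo <- data.frame(
  record_id = "1",
  setting_light = "Outdoors",
  startdate = as.POSIXct("2025-09-22 09:00:00", tz = "UTC"),
  enddate   = as.POSIXct("2025-09-22 11:00:00", tz = "UTC")
)

# each light sample picks up the log entry whose interval contains it
light_data_ext <- light_data |>
  left_join(events_demo,
            by = join_by(record_id, between(Datetime, startdate, enddate)))
```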
Lastly, a function to create the actual table. It takes both the duration_tibble() and histogram_plot() functions and creates a gt HTML table based on them.
duration_table <- function(variable) {
  variable_chr <- rlang::ensym(variable) |> as_string()
  
  duration_tibble({{ variable }}) |> 
    arrange(desc(median)) |> 
    gt(rowname_col = variable_chr) |> 
    fmt_number(mean:max, decimals = 0) |> 
    cols_add(Plot = {{ variable }}, Plot2 = {{ variable }}, .after = max) |> 
    cols_add(rel_duration = total_duration/sum(total_duration), .after = total_duration) |> 
    fmt_percent(rel_duration, decimals = 0) |> 
    cols_label(
      duration = "Episode duration",
      total_duration = "Total duration",
      episodes = "Episodes",
      mean = "Mean",
      min = "0%",
      q025 = "2.5%",
      q250 = "17%",
      median = "Median",
      q750 = "83%",
      q975 = "97.5%",
      max = "100%",
      Plot = "Histogram",
      Plot2 = "Timeline",
      rel_duration = "Relative duration"
    ) |> 
    text_transform(
      locations = cells_body(Plot), 
      fn = \(x) {
        gt::ggplot_image(
          {x %>% map(\(y) histogram_plot({{ variable }}, y))}, 
          height = gt::px(75), aspect_ratio = 2
        )
      }) |> 
    text_transform(
      locations = cells_body(Plot2), 
      fn = \(x) {
        gt::ggplot_image(
          {x %>% map(\(y) timeline_plot({{ variable }}, y))}, 
          height = gt::px(75), aspect_ratio = 2
        )
      }) |> 
    fmt_duration(2:3, input_units = "seconds", max_output_units = 2) |> 
    tab_footnote(
      "Scaled histograms show 66% of data in dark blue, 95% of data in dark & light blue, and the most extreme 5% of data in grey. The median is shown as a red dot.",
      locations = cells_column_labels(Plot)
    ) |> 
    tab_footnote(
      "Timeline plots show the median of the condition (yellow dot) against the rest of the data (grey dot). Colored bands show the interquartile range. Size of the dots indicates the relative number of instances, i.e., large dots indicate many occurrences at a given time (relative to other times).",
      locations = cells_column_labels(Plot2)
    ) |> 
    tab_footnote(
      "Average duration of an episode in this category. eps. = Episode(s)",
      locations = cells_column_labels(duration)
    ) |> 
    tab_footnote(
      "An episode refers to a log entry in a category, until a different entry",
      locations = cells_column_labels(episodes)
    ) |> 
    # cols_units(mean = "lx",
    #            median = "lx") |> 
    gt::tab_spanner("Distribution", min:max) |> 
    tab_footnote(
      "Melanopic equivalent daylight illuminance (lx)",
      locations = list(
        cells_column_labels(c(mean)),
        cells_column_spanners()
      )
    ) |> 
    cols_merge_n_pct(total_duration, rel_duration) |> 
    cols_merge(
      c(duration, episodes), 
      pattern = "{1} ({2} eps.)"
    ) |> 
    cols_align("center") |> 
    tab_style(
      style = cell_fill(color = "grey"),
      locations = list(
        #cells_body(columns = c(min, max)),
        cells_column_labels(c(min, max)))
    ) |> 
    tab_style(
      style = cell_fill(color = "tomato"),
      locations = list(
        #cells_body(median),
        cells_column_labels(median))
    ) |> 
    tab_style(
      style = cell_text(weight = "bold"),
      locations = list(
        cells_body(median),
        cells_stub(),
        cells_column_labels(median),
        cells_column_spanners())
    ) |> 
    tab_style(
      style = cell_fill(color = "#4880B8"),
      locations = list(
        #cells_body(columns = c(q250, q750)),
        cells_column_labels(c(q250, q750)))
    ) |> 
    tab_style(
      style = cell_fill(color = "#C2DAE9"),
      locations = list(
        #cells_body(columns = c(q025, q975)),
        cells_column_labels(c(q025, q975)))
    ) |> 
    cols_move_to_end(c(mean, duration, total_duration)) |> 
    tab_header(
      "Summary of melanopic EDI across log entries",
      subtitle = glue("Question/Characteristic: {label(light_data_ext[[variable_chr]])}")
    ) |> 
    tab_style(
      style = cell_text(color = "red3"), 
      locations = cells_stub(
        # columns = total_duration, # the column whose text you want to style
        rows = rel_duration < 0.05 # condition using another column
      )
    ) |> 
    tab_footnote(
      "Red indicates this category represents < 5% of entries",
      locations = list(
        cells_stub(rows = rel_duration < 0.05)
      )
    )
}
label(light_data_ext$setting_light) <- "What is your general context?"
label(light_data_ext$autonomy) <- "Were the lighting conditions in this setting self-selected?"
label(light_data_ext$setting_indoors_workingspace) <- "Indoors setting (work)"
label(light_data_ext$scenario_daylight2) <- "Daylight conditions (stratified)"
label(light_data_ext$scenario_electric2) <- "Electric lighting conditions (stratified)"
label(light_data_ext$country) <- "Country"
label(light_data_ext$sex) <- "Sex"

c("social_context", "type", "setting_light", "setting_light2", "country", "sex", 
  "autonomy", "setting_indoors", "setting_location", "setting_outdoors", 
  "setting_mixed", "nighttime", "nonwear_activity", "type.detail", 
  "setting_specific", "setting_indoors_home", "setting_indoors_workingspace", 
  "setting_indoors_healthfacility", "setting_indoors_learningfacility", 
  "setting_indoors_retailfacility", "scenario_daylight", "scenario_electric", 
  "screen_phone", "screen_tablet", "screen_pc", "screen_tv", "behaviour_change", 
  "travel_time_zone", "scenario_daylight2", "scenario_electric2", "wear_activity"
) |> 
  walk(\(x) {
    symbol <- sym(x)
    table <- duration_table(!!symbol)
    table
    gtsave(table, glue("tables/table_duration_{x}.png"), vwidth = 1200)
  })
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_ribbon()`).
[This warning repeats for each generated table, for `geom_ribbon()`, `geom_line()`, and `geom_point()`, with between 1 and 22 rows removed each time.]
(`geom_point()`).
Warning: Removed 18 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 15 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 18 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 16 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 16 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 16 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 19 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 19 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 19 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 12 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 12 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 11 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 11 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 11 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 16 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 16 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 16 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 20 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 17 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 20 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 21 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 21 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 21 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 22 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 18 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 15 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 18 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 19 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 17 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 19 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 19 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 19 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 19 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 12 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 12 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 12 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 11 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 12 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 20 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 10 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 20 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 11 rows containing missing values or values outside the scale range
(`geom_ribbon()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 11 rows containing missing values or values outside the scale range
(`geom_point()`).
Time above threshold
In this section we calculate the time above threshold for the single day of 22 September 2025 across latitude and country, looking at the global distribution of melanopic EDI, time above 250 lx, and light dose.
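As a sketch of how such a metric can be computed per participant with `LightLogR`'s `duration_above_threshold()` (assuming the combined `light_data` object with `Id`, `Datetime_UTC`, and `MEDI` columns; the grouping and exact arguments used for the dashboard may differ):

```{r}
# Sketch (not the dashboard code): time above 250 lx melanopic EDI per
# participant on 22 September, plus the mean melanopic EDI for that day.
# `light_data` is assumed to hold Id, Datetime_UTC, and MEDI columns.
light_data |>
  dplyr::filter(lubridate::date(Datetime_UTC) == "2025-09-22") |>
  dplyr::group_by(Id) |>
  dplyr::summarize(
    TAT_250 = duration_above_threshold(MEDI, Datetime_UTC, threshold = 250),
    mean_MEDI = mean(MEDI, na.rm = TRUE)
  )
```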
Shade data
In this section we calculate day and night times around the globe. As we want to use this information for a looped visualization, we use a slightly different time window and collect only 48 hours of data.
```{r}
# Time window (UTC)
t_start <- ymd_hms("2025-09-21 12:00:00", tz = "UTC")
t_end <- ymd_hms("2025-09-23 12:00:00", tz = "UTC")

# Step between frames (adjust to taste: "30 mins", "1 hour", etc.)
time_step <- "15 mins"

# Spatial grid resolution (degrees). 2° keeps things light; 1° looks smoother but is heavier.
lon_step <- 0.5
lat_step <- 0.5

# Darkness mapping: fully dark at civil twilight (-6°), linearly increasing
# from 0 to 1 as altitude goes 0 -> -6°.
dark_full_altitude <- -6 # degrees

# ---------- build lon/lat grid ----------
lons <- seq(-180, 180, by = lon_step)
lats <- seq(-90, 90, by = lat_step)
grid <- expand.grid(lon = lons, lat = lats) |> as_tibble()

# ---------- time sequence ----------
times_utc <- seq(t_start, t_end, by = time_step)

# ---------- compute sun altitude for each (lon, lat, time) ----------
# We'll loop over time slices (efficient enough for this grid size),
# compute altitude (in radians from suncalc), convert to degrees,
# then map to a "darkness" alpha value.
```
```{r}
compute_slice <- function(tt) {
  # build a data frame with the timestamp repeated for each grid point
  df <- data.frame(
    date = rep(tt, nrow(grid)),
    lat = grid$lat,
    lon = grid$lon
  )
  # call suncalc with a data frame instead of separate lat/lon vectors
  sp <- suncalc::getSunlightPosition(data = df, keep = c("altitude"))
  # altitude is returned in radians; convert to degrees
  alt_deg <- sp$altitude * 180 / pi
  # darkness mapping (same as before)
  darkness <- pmin(1, pmax(0, -alt_deg / 6))
  tibble(
    lon = grid$lon,
    lat = grid$lat,
    time = tt,
    alt_deg = alt_deg,
    darkness = darkness
  )
}
```
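A quick sanity check of the darkness mapping (0 at or above the horizon, 1 at or below civil twilight, linear in between):

```{r}
# altitudes of +10, 0, -3, and -10 degrees should map to 0, 0, 0.5, and 1
alt_deg <- c(10, 0, -3, -10)
pmin(1, pmax(0, -alt_deg / 6))
#> [1] 0.0 0.0 0.5 1.0
```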
```{r}
# Create the progress bar once
pb <- progress_bar$new(
  format = "Computing slices [:bar] :percent eta: :eta",
  total = length(times_utc), clear = FALSE, width = 70
)

shade_list <- vector("list", length(times_utc))
for (i in seq_along(times_utc)) {
  # Compute one time slice (without pb$tick() inside)
  shade_list[[i]] <- compute_slice(times_utc[[i]])
  # Now update the bar
  pb$tick()
}
shade_df <- dplyr::bind_rows(shade_list)
```
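The resulting `shade_df` can then feed the looped day-night visualization. A minimal sketch with `gganimate` (the `geom_raster()`/`transition_time()` approach here is an assumption, not necessarily how the dashboard renders it):

```{r}
# Sketch: animate the darkness grid over the 48-hour window.
# Assumes shade_df with lon, lat, time, and darkness columns as computed above.
p_shade <- shade_df |>
  ggplot(aes(x = lon, y = lat, alpha = darkness)) +
  geom_raster(fill = "black") + # darkness as a semi-transparent overlay
  scale_alpha_identity() + # use the 0-1 darkness values directly
  coord_fixed(expand = FALSE) +
  theme_void() +
  labs(title = "{frame_time} UTC") +
  transition_time(time)

# Uncomment to render one frame per time slice:
# animate(p_shade, nframes = length(times_utc), fps = 10)
```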
---
title: "A day in daylight"
author: "Johannes Zauner"
format:
  html:
    self-contained: true
    code-tools: true
    toc: true
    code-linking: true
---

## Preface

This document contains the analysis and results for the event **A day in daylight**, where people from around the world measured a complete day of light exposure on (and around) **22 September 2025**.

::: {.callout-note}
Note that this script is optimized to generate plot outputs and objects to implement in a dashboard. Thus the direct outputs of the script might look distorted in places.
:::

## Importing data

We first set up all packages needed for the analysis.

```{r}
#| label: setup
library(LightLogR)
library(Hmisc)
library(tidyverse)
library(gt)
library(patchwork)
library(gtExtras)
library(gtsummary)
library(legendry)
library(rlang)
library(gganimate)
library(ggdist)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)
library(ggplot2)
library(glue)
library(suncalc)
library(scales)
library(progress)
# source("https://raw.githubusercontent.com/MeLiDosProject/Data_Metadata_Conventions/main/scripts/overview_plot.R")
```

Next we import the survey data. Data were collected with REDCap, and there is an import script to load the data in.

```{r}
#| label: import survey
source("scripts/prep_survey_data.r")
```

### Connecting light data with survey data

First, we collect a list of available data sets. As we need to compare them to the device ids in the survey, we require only the file name without path or extension.

```{r}
#| label: light datasets
path_light <- "data/lightloggers"
files_light <- list.files(path_light) |>
  tools::file_path_sans_ext()
```

Next we check which devices are declared in the survey.

```{r}
#| label: survey devices
survey_devices <- data |>
  drop_na(device_id) |>
  pull(device_id) # get devices

survey_devices |> anyDuplicated() # are any entries duplicated?: No
survey_devices |> setequal(files_light) # are light files and survey entries equal?: Yes
```

No entries are duplicated, and the survey device IDs match the light files.
### Device and location informationNext, we need to get the time zones of the participants and their coordinates. For this, let's reduce the complexity of the dataset and clean the data```{r}#| label: collect device infodata_devices <-data |>drop_na(device_id) |>select(device_id, record_id, city_country, latitude, longitude, age, sex = sex.factor,complete_log = complete_log.factor,behaviour_change = behaviour_change.factor, travel_time_zone) |>mutate(travel_time_zone = travel_time_zone ==1)label(data_devices$travel_time_zone) ="Time zone travel"label(data_devices$age) ="age"label(data_devices$behaviour_change) ="Behaviour change"data_devices |>gt() |>opt_interactive()````Record ID 31` did not finish the post-survey, so we lack data on that device and consequently remove it. Furthermore, `Record ID 30` only has data much outside the time frame of interest. `Record ID 40` don't has sensible data and will also be removed. Likely some issue with a dead battery is the reason.```{r}#| label: remove ID 31data_devices <- data_devices |>filter(!record_id %in%c("31", "30", "44"))```We also have to clean up the city and country, as well as latitude and longitude data. We do this separately and load the data back in.The manual entries for locations had to be cleaned. This was done with OpenAI through an API key. The results were stored in the file `data/cleaned/places.csv`. Uncomment the code cell below to recreate the process. Details in outcome may vary, however.```{r}#| label: clean location data with AI# library(ellmer)# # data_devices_red <- # data_devices |> # select(record_id, city_country, latitude, longitude)# # chat <- chat_openai("If there is more then one place specified, only use the first one. If latitude and longitude are misspecified, make a best guess based on city_country. 
Use IANA names for the time zone identifieres")# # #reducing each line in a table to a single string# data_devices_red <- # data_devices_red |> # pmap(~ paste(paste(names(data_devices_red), c(...), sep = ": "), collapse = ", "))# # #creating an output structure# type_place <- type_object(# record_id = type_string(),# city = type_string(),# country = type_string(),# latitude = type_number(),# longitude = type_number(),# tz_identifier = type_string(),# UTC_dev = type_number("deviation from UTC in hours, given the 22 September 2025")# )# # places <-# parallel_chat_structured(# chat,# data_devices_red,# type = type_place# )# # write.csv(places, "data/cleaned/places.csv")``````{r}#| label: merge cleaned data#read pre-cleaned data inplaces <-read_csv("data/cleaned/places.csv")places <- places |> dplyr::mutate(record_id =as.character(record_id)) |>select(-`...1`)#merge data with main datadata_devices_cleaned <-data_devices |>select(-city_country, -latitude, -longitude) |>mutate(record_id =as.character(record_id)) |>left_join(places, by ="record_id") |>mutate(city =case_match(city,"Tuebingen"~"Tübingen","İzmir"~"Izmir",.default = city),country =case_match(country,"The Netherlands"~"Netherlands",c("Turkiye", "Türkiye") ~"Turkey",c("US", "United States", "USA") ~"United States of America","UK"~"United Kingdom",.default = country) )```### First overviewThe following code cells use the data imported so far to create some descriptive plots about the sample.```{r}#| label: sex axissex_lab <-primitive_bracket(key =key_range_manual( # <− positions + labelsstart =c(-7,0.1),end =c(-0.1,7),# -6 and +6 on the x-axisname =c("Males", "Females") ),position ="bottom"# draw it at the bottom of the panel)``````{r}#| label: sex and gender distributionPlot_demo <-data_devices_cleaned |>mutate(age_group =cut(age, breaks =seq(15,70,5), labels =c("18-20", "21-25", "26-30", "31-35", "36-40", "41-45", "46-50", "51-55", "56-60", "61-65", "66-70"),right =TRUE, ordered_result =TRUE), ) 
|>group_by(sex, age_group) |> dplyr::summarize(n =n(), .groups ="drop") |>mutate(n =ifelse(sex =="Male", -n, n)) |>ggplot(aes(x= age_group, y = n, fill = sex)) +geom_col() +geom_hline(yintercept =0) +scale_y_continuous(breaks =seq(-6,6, by =2), labels =c(6, 4, 2, 0, 2, 4, 6)) +scale_fill_manual(values =c(Male ="#2D6D66", Female ="#A23B54")) +guides(fill ="none", alpha ="none",x =guide_axis_stack("axis", sex_lab )) +theme_minimal() +coord_flip(ylim =c(-7, 7)) +labs(x ="Age (yrs)", y ="n")Plot_demoggsave("figures/Fig1_age.png", width =6, height =6, units ="cm", scale =1.6)``````{r}#| label: statistics about locationslocation_stats <-data_devices_cleaned |> dplyr::summarise(tz =n_distinct(UTC_dev),country =n_distinct(country),n =n() ) |>pivot_longer(cols =everything() ) |> dplyr::mutate(name =case_match(name,"country"~"Countries","tz"~"Time zones","n"~"Participants"),name =factor(name, levels =c("Time zones", "Countries", "Participants")) )P_stats <-location_stats |>ggplot(aes(y =fct_rev(name), x = value, fill = name)) +geom_col() +geom_text(aes(label = value), color ="white", hjust =1.2, fontface =2, size =3) +theme_minimal() +theme_sub_panel(grid =element_blank()) +theme_sub_axis_bottom(text =element_blank()) +theme_sub_plot(background =element_rect(fill =alpha("white", 0.75))) +labs(x =NULL, y =NULL) +scale_fill_manual(values =c(`Time zones`="deepskyblue3",Participants ="red",Countries ="grey")) +guides(fill ="none")P_tz <- data_devices_cleaned |>group_by(UTC_dev) |> dplyr::summarise(n =n()) |>ggplot(aes(x=UTC_dev, y = n)) +geom_vline(xintercept =0, col ="grey") +geom_hline(yintercept =0, col ="grey") +geom_col(fill ="deepskyblue3")+geom_text(aes(label = n), fontface =2, vjust =-0.2) +theme_minimal() +theme_sub_panel(grid.major.y =element_blank(),grid.minor =element_blank()) +theme_sub_axis_left(text =element_blank()) +scale_x_continuous(breaks =seq(-12, 12, 2)) +labs(x ="Deviation from UTC (h) on 22 Sep 2025", y ="n") +coord_cartesian(xlim =c(-11,11), ylim =c(NA, 
30))``````{r}#| label: overview map of the participantsworld <-ne_countries(scale ="medium", returnclass ="sf")countries_colors <-tibble(country = data_devices_cleaned |> dplyr::count(country) |>pull(country),color ="#0073C2FF",stringsAsFactors =FALSE )world$color <-ifelse( world$name %in% countries_colors$country, countries_colors$country[match(world$name, countries_colors$country)],NA )location_info <-tibble(country = data_devices_cleaned |>pull(country),lat = data_devices_cleaned |>pull(latitude),lon = data_devices_cleaned |>pull(longitude),color ="#0073C2FF",stringsAsFactors =FALSE ) |>mutate(lat2 = plyr::round_any(lat, 12), lon2 = plyr::round_any(lon, 12)) |> dplyr::summarize(.by =c(lat2, lon2),lat =mean(lat),lon =mean(lon),color =unique(color),n =n() )locations <-st_as_sf(location_info, coords =c("lon", "lat"), crs =4326)world_proj <-st_transform(world, crs ="+proj=eqc")locations_proj <-st_transform(locations, crs ="+proj=eqc")bb <-st_bbox(world_proj)tz <- sf::st_read("data/tz_now/combined-shapefile-with-oceans-now.shp") # or .gpkg / .geojsontz_lines <- sf::st_boundary(tz)P_map <-ggplot() +geom_sf(data = world_proj,# aes(fill = color),fill ="grey",color =NA,size =0.25,alpha =0.5,show.legend =FALSE ) +geom_sf(data = tz_lines,colour ="deepskyblue3",linewidth =0.15) +geom_sf(data = locations_proj,aes(size = n),fill ="red",alpha =0.9,shape =21,color ="#0073C2FF",stroke =0.2 ) +geom_sf_text(data = locations_proj,aes(label = n),size =1.5,fontface =2,color ="white",alpha =0.75 ) +scale_fill_manual(values =rep("#0073C2FF", 15)) +scale_size_continuous(range =c(2, 5)) +theme_minimal() +theme(legend.position ="none") +labs(x =NULL, y =NULL) +coord_sf(expand =FALSE)``````{r}#| label: combined location plot#| fig-height: 9.5#| fig-width: 14P_map_patch <-(P_map +inset_element(P_stats, 0.05, 0.05, 0.25, 0.25)) / P_tz +plot_layout(heights =c(4.4,1))ggsave("figures/Fig2_location.pdf",P_map_patch,width =15, height =10, units ="cm", scale 
=1.6)ggsave("figures/Fig2_location.png", P_map_patch,width =14, height =9.5, units ="cm", scale =1.6)```### Import wearable dataNext, we import the light data. There are two devices in use: `ActLumus` and `ActTrust` we need to import them separately, as they are using different import functions. `device_id` with four number indicate the `ActLumus` devices, whereas with seven numbers the `ActTrust`. We add a column to the data indicating the Type of device in use. We also make sure that the spelling equals the `supported_devices()` list from `LightLogR`. Then we construct filename paths for all files.```{r}#| label: collect wearable infoc("ActLumus", "ActTrust") %in%supported_devices()data_devices_cleaned <-data_devices_cleaned |> dplyr::mutate(device_type =case_when(str_length(device_id) ==4~"ActLumus",str_length(device_id) ==7~"ActTrust" ),file_path =glue("data/lightloggers/{device_id}.txt") )``````{r}#| label: import filesdata_devices_cleaned <-data_devices_cleaned |> dplyr::mutate(data = purrr::pmap(list(x = device_type, y = file_path, z = tz_identifier), \(x, y, z) {import_Dataset(device = x, filename = y, tz = z,silent =TRUE) } ) )```We end with one dataset per row entry. As the two `ActTrust` files do not contain a melanopic EDI column, we will use the photopic illuminance column `LIGHT` towards that end. As there are only two participants with this shortcoming, it will not influence results overduly.```{r}#| label: ActTrust MEDI variabledata_devices_cleaned <-data_devices_cleaned |> dplyr::mutate(data = purrr::map2(device_type, data, \(x,y) {if(x =="ActTrust") { y |> dplyr::rename(MEDI = LIGHT) }else y } ) )```Further, the dataset in Malaysia had a device malfunction on 22 September and only worked from the 23 September onwards. As there are minimal differences between dates and very few datasets in that region, we will not dismiss that dataset but rather shift data by one day. 
We need to shift the data by another 8 hours backwards, as that dataset was stored in UTC time (for some reason).```{r}#| label: Shift one day for Malaysiadata_devices_cleaned <-data_devices_cleaned |> dplyr::mutate(data = purrr::map2(record_id, data, \(x,y) {if(x =="25") { y |> dplyr::mutate(Datetime = Datetime -ddays(1) -dhours(8) ) }else y } ) )```## Light data### Cleaning light dataIn this section we will prepare the light data through the following steps:- resampling data to 5 minute intervals- filling in missing data with explicit gaps- removing data that does not fall between `2025-09-21 10:00:00 UTC` and `2025-09-23 12:00:00 UTC`, which contains all times where 22 September occurs *somewhere* on the planet```{r}#| label: Cleanup of light datadata_devices <-data_devices_cleaned |> dplyr::mutate(data = purrr::map(data, \(x) { x |>aggregate_Datetime("5 mins") |>#resample to 5 minsgap_handler(full.days =TRUE) |>#put in explicit gapsfilter_Datetime(start ="2025-09-21 10:00:00",end ="2025-09-23 12:00:00",tz ="UTC") #cut out a section of data } ) )```Next, we add a secondary `Datetime` column that runs on UTC time.```{r}#| label: add UTC datadata_devices <-data_devices |> dplyr::mutate(data = purrr::map(data, \(x) { x |> dplyr::mutate(Datetime_UTC = Datetime |>force_tz("UTC")) } ) )```### Visualizing light dataNow we can visualize the whole dataset - first by combining all datasets.```{r}#| label: period of interest#| warning: falsestart_dt <-as.POSIXct("2025-09-21 10:00:00", tz ="UTC")start_dt2 <-as.POSIXct("2025-09-22 00:00:00", tz ="UTC")end_dt <-as.POSIXct("2025-09-23 12:00:00", tz ="UTC")end_dt2 <-as.POSIXct("2025-09-23 00:00:00", tz ="UTC")``````{r}#| label: combine light data#| warning: falselight_data <-join_datasets(!!!data_devices$data) |>mutate(Datetime = Datetime |>with_tz("UTC"))``````{r}#| label: visualize light data#| warning: falselight_data |>aggregate_Datetime("1hour") |>gg_days(facetting =FALSE, group = Id, geom ="ribbon",lwd =0.25,fill 
="skyblue3",color ="skyblue4",alpha =0.1,y.axis.label ="UTC Time" ) +geom_vline(xintercept =c(start_dt, end_dt), color ="red")light_data |>aggregate_Datetime("1hour") |>gg_days(Datetime_UTC,geom ="ribbon",facetting =FALSE,fill ="skyblue3",color ="skyblue4",alpha =0.1,group = Id, lwd =0.25,y.axis.label ="Local Time" ) +geom_vline(xintercept =c(start_dt2, end_dt2), color ="red")``````{r}#| label: animate light data#| warning: falselight_data |>aggregate_Datetime("1hour") |>gg_days(Datetime_UTC,facetting =FALSE, group = Id, lwd =0.25,y.axis.label ="UTC Time" ) +geom_vline(xintercept =c(start_dt2, end_dt2), color ="red")boundaries <-tibble(start =c(start_dt, start_dt2),end =c(end_dt, end_dt2),name =c("UTC Time", "Local Time"))p <-light_data |>aggregate_Datetime("2 hours") |>select(Id, Datetime, Datetime_UTC, MEDI) |>pivot_longer(-c(Id, MEDI)) |>mutate(name =case_match(name,"Datetime"~"UTC Time","Datetime_UTC"~"Local Time")) |> dplyr::mutate(name =factor(name)) |>gg_days(value,geom ="ribbon",fill ="skyblue3",alpha =0.4,color ="black",facetting =FALSE, group = Id, lwd =0.1,x.axis.label ="{next_state} {if(transitioning) '(transitioning)' else ''}",y.axis.label ="Melanopic EDI (lx)",x.axis.breaks = \(x) Datetime_breaks(x, by ="6 hours", shift =0),x.axis.format ="%H:%M" ) +geom_vline(data = boundaries, aes(xintercept=start), col ="red", lty =2, inherit.aes =FALSE)+geom_vline(data = boundaries, aes(xintercept=end), col ="red", lty =2, inherit.aes =FALSE)+geom_segment(data = boundaries, aes(y =25000, x = start, xend = end), arrow =arrow(length =unit(0.1, "inches"), ends ="both"), col ="red", inherit.aes =FALSE)+annotate(geom ="text", y =25000, x =mean(c(start_dt2, end_dt2)), vjust =-0.4, label ="Global 22 September", col ="red") +transition_states( name, transition_length =1,state_length =1 )if(interactive()){animation <-animate(p, width =1200, height =700, res =150)animationanim_save("figures/patterns.gif", animation)}```## Events### Cleaning eventsIn this section we deal 
with the activity logs, first by filtering them out of the dataset and selecting the relevant aspects.

```{r}
#| label: filter events
events <- 
  data |> 
  filter(redcap_repeat_instrument == "log_a_new_activity") |> 
  select(
    record_id, type.factor, social_context.factor, wear_activity.factor, 
    nonwear_activity.factor, nighttime.factor, setting_level01.factor, 
    setting_level02_indoors.factor, setting_level02_indoors_home.factor, 
    setting_level02_indoors_workingspace.factor, 
    setting_level02_indoors_healthfacility.factor, 
    setting_level02_indoors_learningfacility.factor, 
    setting_level02_indoors_leisurespace.factor, 
    setting_level02_indoors_retailfacility.factor, 
    setting_level02_mixed.factor, setting_level02_outdoors.factor, 
    lighting_scenario_daylight___1.factor, lighting_scenario_daylight___2.factor, 
    lighting_scenario_daylight___3.factor, lighting_scenario_daylight___4.factor, 
    lighting_scenario_3___1.factor, lighting_scenario_3___2.factor, 
    lighting_scenario_3___3.factor, lighting_scenario_2___1.factor, 
    lighting_scenario_2___2.factor, lighting_scenario_2___3.factor, 
    lighting_scenario_2___4.factor, autonomy.factor, notes, startdate, enddate
  )

#adding labels to the factors
label(events$type.factor) = "Wear type: Are you wearing the light logger at the moment?"
label(events$social_context.factor) = "Are you alone or with others?"
label(events$wear_activity.factor) = "Wear activity"
label(events$nonwear_activity.factor) = "Non-wear activity"
label(events$nighttime.factor) = "Where was the light logger when you were asleep?"
label(events$setting_level01.factor) = "Select the setting"
label(events$setting_level02_indoors.factor) = "Indoors setting"
label(events$setting_level02_indoors_home.factor) = "Indoors setting (home)"
label(events$setting_level02_indoors_workingspace.factor) = "Indoors setting (working space)"
label(events$setting_level02_indoors_learningfacility.factor) = "Indoors setting (learning facility)"
label(events$setting_level02_indoors_retailfacility.factor) = "Indoors setting (retail
facility)"label(events$setting_level02_indoors_healthfacility.factor) ="Indoors setting (health facility)"label(events$setting_level02_indoors_leisurespace.factor) ="Indoors setting (leisure space)"label(events$setting_level02_outdoors.factor) ="Outdoors setting"label(events$setting_level02_mixed.factor) ="Indoors-outdoors setting"label(events$lighting_scenario_daylight___1.factor) ="Select lighting setting (daylight) (choice=Outdoors (direct sunlight))"label(events$lighting_scenario_daylight___2.factor) ="Select lighting setting (daylight) (choice=Outdoors (in shade / cloudy))"label(events$lighting_scenario_daylight___3.factor) ="Select lighting setting (daylight) (choice=Indoors (near window / exposed to daylight))"label(events$lighting_scenario_daylight___4.factor) ="Select lighting setting (daylight) (choice=Indoors (away from window))"label(events$lighting_scenario_3___1.factor) ="Select lighting setting (electric light) (choice=Lights are switched on)"label(events$lighting_scenario_3___2.factor) ="Select lighting setting (electric light) (choice=Low-light or dimmed lights)"label(events$lighting_scenario_3___3.factor) ="Select lighting setting (electric light) (choice=Completed darkness)"label(events$lighting_scenario_2___1.factor) ="Select lighting setting (screen use) (choice=Smartphone)"label(events$lighting_scenario_2___2.factor) ="Select lighting setting (screen use) (choice=Tablet)"label(events$lighting_scenario_2___3.factor) ="Select lighting setting (screen use) (choice=Computer)"label(events$lighting_scenario_2___4.factor) ="Select lighting setting (screen use) (choice=Television)"label(events$autonomy.factor) ="Were the lighting conditions in this setting self-selected (i.e., you had control over lighting intensity, spectrum, or exposure)?"```Next, we condense columns that can be expressed as one. We also lose the `.factor` extension, as now all doubles are removed. 
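The condensing relies on two `forcats` idioms: `fct_relabel()` to strip parenthesised clarifications from every level, and `fct_recode()` to map verbose survey labels onto short analysis labels. A minimal sketch with invented levels:

```{r}
#| label: factor recoding example
library(forcats)
library(stringr)

f <- factor(c("Working space (office)", "Leisure space (sports, recreation)"))

# strip everything from the first " (" onwards, across all levels
f_short <- f |> fct_relabel(\(x) str_remove(x, " \\(.*\\)"))
f_short

# map a long survey label onto a short analysis label (new = old)
setting <- factor("Learning facility") |> fct_recode("Education" = "Learning facility")
setting
```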
Finally, we simplify entries.```{r}#| label: simplify factorsevents <-events |>rename_with(\(x) x |>str_remove(".factor")) |>#remove .factor extension dplyr::mutate(type = type |>fct_relabel(\(x) str_remove(x, "-time| time| \\(not wearing light logger\\)")),across(c(wear_activity, setting_level01), \(x) x |>fct_relabel(\(y) str_remove(y, " \\(.*\\)")) ),nonwear_activity = nonwear_activity |>fct_recode("Dark mobile"="Left in a bag, or other mobile dark place","Dark stationary"="Left in a drawer or cabinet, or other stationary dark place","Stationary"="Left on a table or other surface with varying light exposure" ),nighttime = nighttime |>fct_recode("Upward"="Facing upward on bedside table","Downward"="Facing downward on bedside table" ),across(c(setting_level02_indoors, setting_level02_outdoors), \(x) x |>fct_recode("Leisure"="Leisure space (sports, recreation, entertainment)","Commercial"="Retail, food or services facility","Workplace"="Working space","Education"="Learning facility","Healthcare"="Health facility" ) ),setting_level01 = setting_level01 |>fct_recode("Mixed"="Indoor-outdoor setting" ),autonomy = autonomy |>fct_recode(Yes ="Yes, fully self-selected (e.g., adjusting lights at home or in a private office, moving to shaded area)",Partly ="Partly self-selected (e.g., some control such as opening blinds or switching a desk lamp, but not over main lighting)",No ="Not self-selected (e.g., public transport, shared office, classroom, hospital, airplane, etc.)",NULL ="Not applicable" ) ) |> dplyr::rename(setting_light = setting_level01)``````{r}#| label: combining factorsevents <-events |> dplyr::mutate(scenario_daylight =case_when( (lighting_scenario_daylight___1 =="Checked") & (setting_light %in%c("Outdoors", "Mixed")) ~"Direct sunlight", (lighting_scenario_daylight___2 =="Checked") & (setting_light %in%c("Outdoors", "Mixed")) ~"Shade / cloudy", (lighting_scenario_daylight___3 =="Checked") & (setting_light %in%c("Indoors", "Mixed")) ~"Near a window", 
(lighting_scenario_daylight___4 =="Checked") & (setting_light %in%c("Indoors", "Mixed")) ~"Away from window" ),scenario_electric =case_when( lighting_scenario_3___3 =="Checked"~"Lights off", lighting_scenario_3___2 =="Checked"~"Dim light", lighting_scenario_3___1 =="Checked"~"Lights on", ),across(starts_with("lighting_scenario_2___"), \(x) ifelse(x =="Checked", TRUE, FALSE)), ) |> dplyr::rename(screen_phone = lighting_scenario_2___1,screen_tablet = lighting_scenario_2___2,screen_pc = lighting_scenario_2___3,screen_tv = lighting_scenario_2___4 ) |>select(-starts_with("lighting_scenario")) |> dplyr::mutate(wear_activity =case_when(type =="Wear"~ wear_activity, .default =NA),nonwear_activity =case_when(type =="Non-wear"~ nonwear_activity, .default =NA),nighttime =case_when(type =="Bedtime"~ nighttime, .default =NA),setting_level02_mixed =case_when(setting_light =="Mixed"~ setting_level02_mixed, .default =NA),setting_level02_indoors =case_when(setting_light =="Indoors"~ setting_level02_indoors, .default =NA),setting_level02_outdoors =case_when(setting_light =="Outdoors"~ setting_level02_outdoors, .default =NA),setting_level02_indoors_leisurespace =case_when(setting_level02_indoors =="Leisure"~ setting_level02_indoors_leisurespace, .default =NA),setting_level02_indoors_workingspace =case_when(setting_level02_indoors =="Workplace"~ setting_level02_indoors_workingspace, .default =NA), ) |>unite("type.detail", c(wear_activity, nonwear_activity, nighttime), na.rm =TRUE,remove =FALSE) |>unite("setting_location", c(setting_level02_indoors, setting_level02_outdoors, setting_level02_mixed), na.rm =TRUE, remove =FALSE) |>unite("setting_specific", starts_with("setting_level02_indoors_"), na.rm =TRUE, remove =FALSE) |> dplyr::rename_with(\(x) x |>str_remove("_level02")) |>relocate(scenario_daylight, scenario_electric, .before = screen_phone) |>relocate(startdate, .before =1) |>select(-enddate) |> dplyr::mutate(across(c(setting_location, setting_specific, type.detail), \(x) 
fct_recode(x, NULL ="")) )``````{r}#| label: joining participant informationpart_data <- data_devices |>select(-data) |>mutate(record_id =as.character(record_id))events_complete <-events |> dplyr::mutate(record_id =as.character(record_id)) |>left_join(part_data, by ="record_id") |>drop_na(tz_identifier)label(events_complete$record_id) ="Record ID"events_complete <-events_complete |> dplyr::mutate(Datetime =as.POSIXct(startdate, tz ="UTC"),UTC_dt =force_tzs(Datetime, tz_identifier),.before =1) |>select(-startdate)```### SummariesIn this section we will calculate some summary statistics regarding events```{r}#| label: duration of statesevents_complete <-events_complete |> dplyr::mutate(status.duration =c(diff(Datetime), na_dbl), .by = record_id,.after = Datetime) |>filter(!record_id %in%c("31", "30", "44"))``````{r}#| label: adding labelslabel(events_complete$status.duration) ="Time between log entries"label(events_complete$type.detail) ="Wear/Non-wear context"label(events_complete$setting_location) ="General setting"label(events_complete$setting_specific) ="Specific indoor setting"label(events_complete$scenario_daylight) ="Daylight conditions"label(events_complete$scenario_electric) ="Electric lighting conditions"label(events_complete$screen_phone) ="Phone use"label(events_complete$screen_tablet) ="Tablet use"label(events_complete$screen_pc) ="Computer use"label(events_complete$screen_tv) ="Television use"label(events_complete$sex) ="Sex"label(events_complete$city) ="City"label(events_complete$country) ="Country"label(events_complete$latitude) ="Latitude"label(events_complete$longitude) ="Longitude"label(events_complete$tz_identifier) ="Time zone identifier"label(events_complete$UTC_dev) ="Time zone deviation from UTC"label(events_complete$device_type) ="Used device"label(events_complete$file_path) ="File path"label(events_complete$setting_indoors) ="Indoor settings"label(events_complete$setting_outdoors) ="Outdoor settings"label(events_complete$setting_mixed) 
="Outdoor-Indoor mixed settings"
label(events_complete$wear_activity) = "Activity"
label(events_complete$nonwear_activity) = "Non-wear wearable position"
label(events_complete$nighttime) = "Nightstand wearable measurement direction"
```

```{r}
#| label: overall summaries
event_summary1 <- 
  events_complete |> 
  dplyr::summarize(
    .by = record_id,
    n = n(),
    mean.duration = mean(status.duration, na.rm = TRUE),
    covered.timespan = last(Datetime) - first(Datetime)
  )

label(event_summary1$n) = "Log entries"
label(event_summary1$mean.duration) = "Mean duration between log entries"
units(event_summary1$mean.duration) = "hours"
label(event_summary1$covered.timespan) = "Total time span of log entries"

event_summary1.tbl <- 
  event_summary1 |> 
  tbl_summary(
    include = -record_id,
    statistic = list(all_continuous() ~ "{mean} ({min}-{max})")
  ) |> 
  modify_caption("**Activity logging (by-participant level)**")
event_summary1.tbl

gtsave(event_summary1.tbl |> as_gt(), "tables/table1.png")
```

```{r}
#| label: detailed summaries
event_tbl_setting <- 
  events_complete |> 
  select(setting_indoors, setting_outdoors, setting_mixed) |> 
  tbl_summary(missing = "no")
event_tbl_setting
gtsave(event_tbl_setting |> as_gt(), "tables/table2.png")

event_tbl_indoors <- 
  events_complete |> 
  select(setting_specific) |> 
  tbl_summary(missing = "no")
event_tbl_indoors
gtsave(event_tbl_indoors |> as_gt(), "tables/table3.png")

event_tbl_wear <- 
  events_complete |> 
  select(type, wear_activity, nonwear_activity, nighttime) |> 
  tbl_summary(missing = "no")
event_tbl_wear
gtsave(event_tbl_wear |> as_gt(), "tables/table4.png")

event_tbl_other <- 
  events_complete |> 
  select(social_context, setting_light, scenario_daylight, scenario_electric, autonomy) |> 
  tbl_summary(missing_text = "Missing")
event_tbl_other
# distinct file name, so this table does not overwrite table4.png above
gtsave(event_tbl_other |> as_gt(), "tables/table6.png")
```

```{r}
#| label: time in condition summary
event_tbl_duration <- 
  events_complete |> 
  # drop_na(setting_light) |> 
  dplyr::mutate(
    setting_light = case_when(
      type == "Bedtime" ~ "Bed", 
      type == "Non-wear" ~ "Non-wear",
      .default = setting_light
|>as.character()),setting_light = setting_light |>factor(levels =c("Bed", "Indoors", "Mixed", "Outdoors", "Non-wear"))) |> dplyr::summarize(`Daily duration`=sum(status.duration, na.rm =TRUE),.by =c(setting_light, record_id)) |> dplyr::summarize(`Daily duration`=mean(`Daily duration`),.by =c(setting_light)) |> dplyr::mutate(`Daily duration`=`Daily duration`/sum(as.numeric(`Daily duration`)) *24*60*60,Percent = (as.numeric(`Daily duration`)/sum(as.numeric(`Daily duration`)))# vec_fmt_percent() ) |>arrange(setting_light) |>gt(rowname_col ="setting_light") |>grand_summary_rows( fns =list( sum ~sum(.) ),fmt =list(~fmt_percent(., columns = Percent, decimals =0), ~fmt_duration(., columns =`Daily duration`, input_units ="secs",max_output_units =2)) ) |>fmt_duration(`Daily duration`, input_units ="secs",max_output_units =2) |>fmt_percent(columns = Percent, decimals =0) |>sub_missing() |>tab_header(title ="Mean daily duration in condition")event_tbl_durationgtsave(event_tbl_duration, "tables/table5.png")```### Combining Events with light dataIn this step, we expand the light measurements with the event data. 
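Schematically, each log entry acts as a state running from its own timestamp to the next entry's timestamp, which `dplyr::lead()` provides. A minimal sketch on toy timestamps (names invented):

```{r}
#| label: start end interval example
library(dplyr)

activity_log <- tibble(
  Datetime = as.POSIXct(
    c("2025-09-22 08:00", "2025-09-22 09:30", "2025-09-22 12:00"),
    tz = "UTC"
  )
)

# each entry ends where the next one starts; the last entry gets NA,
# which is then capped to a fixed end point (as done with end_dt below)
activity_log_intervals <- activity_log |> 
  mutate(start = Datetime,
         end   = lead(Datetime))
activity_log_intervals
```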
To this end, we need to specify start and end times for each log entry and thus state.```{r}#| label: combining light and log dataevents_light <-events_complete |>rename(Id = device_id) |>group_by(Id) |> dplyr::mutate(setting_light2 = setting_light,setting_light =case_when(type =="Bedtime"~"Bed", type =="Non-wear"~"Non-wear",.default = setting_light |>as.character()),setting_light = setting_light |>factor(levels =c("Bed", "Indoors", "Mixed", "Outdoors", "Non-wear")),end =case_when(is.na(lead(Datetime)) ~ (end_dt +ddays(1)),# Datetime >= end_dt ~ Datetime,!is.na(lead(Datetime)) ~lead(Datetime) ) |>as.POSIXct(tz ="UTC"),Id =factor(Id),.after = Datetime ) |>rename(start = Datetime) |>select(-c(age:file_path, status.duration, UTC_dt))#adding event-specific informationlight_data_ext <-light_data |>select(Id, Datetime, Datetime_UTC, MEDI) |>add_states(events_light, Datetime.colname = Datetime_UTC) |>rename(DatetimeUTC = Datetime) |>rename(Datetime = Datetime_UTC) |>unite( scenario_daylight2, setting_light, scenario_daylight, sep =": ", remove =FALSE, na.rm =TRUE ) |> dplyr::mutate(scenario_daylight2 =case_when(setting_light %in%c("Outdoors", "Mixed", "Indoors", "Bed") &!is.na(scenario_daylight) ~paste(setting_light, scenario_daylight, sep =": ")),scenario_electric2 =case_when(setting_light %in%c("Outdoors", "Indoors", "Mixed", "Bed") &!is.na(scenario_electric) ~paste(setting_light, scenario_electric, sep =": ")) )#adding participant-specific informationparticipant_data <- data_devices_cleaned |>select(-record_id, -data) |>rename(Id = device_id) |> dplyr::mutate(Id =factor(Id))light_data_ext <- light_data_ext |>left_join(participant_data, by ="Id")``````{r}#| label: plotting light data#| warning: false#| message: false#| fig-height: 20#| fig-width: 20participants <- light_data_ext |>pull(Id) |>levels()light_data_ext |>gg_days(facetting =FALSE,y.axis.label ="Melanopic EDI (lx)") |>gg_state(setting_light, aes_fill = setting_light, alpha =0.5) +scale_fill_manual(values 
=c(Bed ="darkgrey",`Non-wear`="tomato",Indoors ="gold",Outdoors ="skyblue2",Mixed ="lightgreen" ) ) +facet_wrap(~Id, ncol =5) +labs(fill ="Condition")ggsave("figures/Fig3_overview.png", width =10, height =10, scale =2)light_data_ext |>filter(Id %in% participants[1:24]) |>gg_days(facetting =FALSE,y.axis.label ="Melanopic EDI (lx)") |>gg_state(setting_light, aes_fill = setting_light, alpha =0.5) +scale_fill_manual(values =c(Bed ="darkgrey",`Non-wear`="tomato",Indoors ="gold",Outdoors ="skyblue2",Mixed ="lightgreen" ) ) +facet_wrap(~Id, ncol =8) +labs(fill ="Condition")ggsave("figures/Fig3a_overview.png", width =14, height =5, scale =2)light_data_ext |>filter(Id %in% participants[25:47]) |>gg_days(facetting =FALSE,y.axis.label ="Melanopic EDI (lx)") |>gg_state(setting_light, aes_fill = setting_light, alpha =0.5) +scale_fill_manual(values =c(Bed ="darkgrey",`Non-wear`="tomato",Indoors ="gold",Outdoors ="skyblue2",Mixed ="lightgreen" ) ) +facet_wrap(~Id, ncol =8) +labs(fill ="Condition")ggsave("figures/Fig3b_overview.png", width =14, height =5, scale =2)```## State analysisIn this section, we look at melanopic EDI for various conditions.First, we create a function that takes a variable and a filter variable for that column, and creates a histogram off that basis.```{r}#| label: Histogram plot functionhistogram_plot <-function(variable, filter) {light_data_ext |># drop_na({{ variable }}) |> filter({{ variable }} == filter) |>ggplot(aes(y=1, x = MEDI, slab_fill =after_stat(level))) +# geom_violin() +stat_histinterval(na.rm =TRUE, breaks =20, align =align_center(at =0))+stat_pointinterval(na.rm =TRUE, point_size =10, colour ="red") +scale_x_continuous(trans ="symlog", breaks =c(0, 10^(0:5)),labels =expression(0,10^0,10^1, 10^2, 10^3, 10^4, 10^5),limits =c(0, 10^5)) +theme_void() +scale_color_manual(values = scales::brewer_pal()(3)[-1], aesthetics ="slab_fill") +theme_sub_axis_x(text =element_text(size =30)) +guides(slab_fill ="none") +ylim(c(1,2))}```The second plot 
function takes the same inputs, and creates a doubleplot showing *when* these instances occur.```{r}#| label: Timeline plot functionlight_data_1day <-light_data_ext |>aggregate_Datetime(unit ="1 hour")timeline_plot <-function(variable, filter) {light_data_1day |> dplyr::mutate(state = {{ variable }} == filter,state =ifelse(is.na(state), FALSE, state) ) |>group_by(state) |>select(state, Datetime, MEDI) |>aggregate_Date(date.handler = \(x) "2025-09-22",numeric.handler = \(x) median(x, na.rm =TRUE),n =n(),lower =quantile(MEDI, p =0.25, na.rm =TRUE),upper =quantile(MEDI, p =0.75, na.rm =TRUE) ) |> dplyr::mutate(n = n/sum(n)) |>gap_handler(full.days =TRUE) |>gg_day(facetting =FALSE,geom ="blank",jco_color =FALSE, x.axis.breaks = hms::hms(hours =seq(0, 24, by =6)) ) +geom_ribbon(aes(ymin = lower, ymax = upper, fill = state), alpha =0.25 ) +geom_hline(yintercept =0) +geom_line(aes(colour = state), lwd =0.5 )+geom_point(aes(size = n, colour = state)) +scale_fill_manual(values =c(`TRUE`="#EFC000FF",`FALSE`="darkgrey")) +scale_color_manual(values =c(`TRUE`="#EFC000FF",`FALSE`="darkgrey")) +coord_cartesian(ylim =c(0, 10^5)) +scale_size_continuous(range =c(3, 20), limits =c(0,1)) +scale_alpha_continuous(range =c(0.01, 0.5)) +guides(lwd ="none", colour ="none", fill ="none", alpha ="none",size ="none") +theme_void() +theme_sub_axis(text =element_text(size =30)) +theme_sub_plot(margin =margin(r =35)) +scale_y_continuous(trans ="symlog", breaks =c(0, 10^(0:5)),labels =expression(0,10^0,10^1, 10^2, 10^3, 10^4, 10^5),limits =c(0, 10^5))}```Next, we create a function that also takes a variable, and creates a table containing durations, episodes, and some key metrics.```{r}#| label: Preprocessing function for the tableduration_tibble <-function(variable) { instances <- light_data_ext |># group_by() |>extract_states({{ variable }}) |>group_by({{ variable }}) |>summarize_numeric(prefix ="",remove =c("start", "end", "epoch"))# drop_na() metrics <- light_data_ext |>group_by({{ variable 
}}) |> dplyr::summarize(mean =mean(MEDI, na.rm =TRUE),min =min(MEDI, na.rm =TRUE),q025 =quantile(MEDI, probs =0.025, na.rm =TRUE),q250 =quantile(MEDI, probs =0.17, na.rm =TRUE),median =median(MEDI, na.rm =TRUE),q750 =quantile(MEDI, probs =0.83, na.rm =TRUE),q975 =quantile(MEDI, probs =0.975, na.rm =TRUE),max =max(MEDI, na.rm =TRUE), )# drop_na() variable_chr <- rlang::ensym(variable) |>as_string()left_join(instances, metrics, by = variable_chr) |>drop_na()}```Lastly, a function to create the actual table. It takes both the `duration_tibble()` and `histogram_plot()` functions and creates a `gt` html table based on it.```{r}#| label: Table functionduration_table <-function(variable) { variable_chr <- rlang::ensym(variable) |>as_string()duration_tibble({{ variable }}) |>arrange(desc(median)) |>gt(rowname_col = variable_chr) |>fmt_number(mean:max, decimals =0) |>cols_add(Plot = {{ variable }}, Plot2 = {{ variable }},.after = max) |>cols_add(rel_duration = total_duration/sum(total_duration), .after = total_duration) |>fmt_percent(rel_duration, decimals =0) |>cols_label(duration ="Episode duration",total_duration ="Total duration",episodes ="Episodes",mean ="Mean",min ="0%",q025 ="2.5%",q250 ="17%",median ="Median",q750 ="83%",q975 ="97.5%",max ="100%",Plot ="Histogram",Plot2 ="Timeline",rel_duration ="Relative duration" ) |>text_transform(locations =cells_body(Plot), fn = \(x) { gt::ggplot_image({ x %>%map(\(y) histogram_plot({{ variable }}, y)) }, height = gt::px(75), aspect_ratio =2) }) |>text_transform(locations =cells_body(Plot2), fn = \(x) { gt::ggplot_image({ x %>%map(\(y) timeline_plot({{ variable }}, y)) }, height = gt::px(75), aspect_ratio =2) }) |>fmt_duration(2:3, input_units ="seconds", max_output_units =2) |>tab_footnote("Scaled histograms show 66% of data in dark blue, 95% of data in dark & light blue, and the most extreme 5% of data in grey. 
The median is shown as a red dot.",
      locations = cells_column_labels(Plot)
    ) |> 
    tab_footnote("Timeline plots show the median of the condition (yellow dot) against the rest of the data (grey dot). Colored bands show the interquartile range. Size of the dots indicates the relative number of instances, i.e., large dots indicate many occurrences at a given time (relative to other times).",
      locations = cells_column_labels(Plot2)
    ) |> 
    tab_footnote("Average duration of an episode in this category. eps. = Episode(s)",
      locations = cells_column_labels(duration)
    ) |> 
    tab_footnote("An episode refers to a log entry in a category, until a different entry",
      locations = cells_column_labels(episodes)
    ) |> 
    # cols_units(mean = "lx",
    #            median = "lx") |> 
    gt::tab_spanner("Distribution", min:max) |> 
    tab_footnote("Melanopic equivalent daylight illuminance (lx)",
      locations = list(
        cells_column_labels(c(mean)),
        cells_column_spanners()
      )
    ) |> 
    cols_merge_n_pct(total_duration, rel_duration) |> 
    cols_merge(c(duration, episodes), pattern = "{1} ({2} eps.)") |> 
    cols_align("center") |> 
    tab_style(
      style = cell_fill(color = "grey"),
      locations = list(
        #cells_body(columns = c(min, max)),
        cells_column_labels(c(min, max)))
    ) |> 
    tab_style(
      style = cell_fill(color = "tomato"),
      locations = list(
        #cells_body(median),
        cells_column_labels(median))
    ) |> 
    tab_style(
      style = cell_text(weight = "bold"),
      locations = list(
        cells_body(median),
        cells_stub(),
        cells_column_labels(median),
        cells_column_spanners())
    ) |> 
    tab_style(
      style = cell_fill(color = "#4880B8"),
      locations = list(
        #cells_body(columns = c(q250, q750)),
        cells_column_labels(c(q250, q750)))
    ) |> 
    tab_style(
      style = cell_fill(color = "#C2DAE9"),
      locations = list(
        #cells_body(columns = c(q025, q975)),
        cells_column_labels(c(q025, q975)))
    ) |> 
    cols_move_to_end(c(mean, duration, total_duration)) |> 
    tab_header("Summary of melanopic EDI across log entries",
      subtitle = glue("Question/Characteristic: {label(light_data_ext[[variable_chr]])}")
    ) |> 
    tab_style(
      style = cell_text(color = "red3"), 
      locations = cells_stub(
        # columns = total_duration, # the column whose text you want to style
        rows = rel_duration < 0.05 # condition using another column
      )) |> 
    tab_footnote("Red indicates this category represents < 5% of entries",
      locations = list(
        cells_stub(rows = rel_duration < 0.05)
      )
    )
}
```

```{r}
#| label: Creating the overview tables
#| message: false
label(light_data_ext$setting_light) <- "What is your general context?"
label(light_data_ext$autonomy) <- "Were the lighting conditions in this setting self-selected?"
label(light_data_ext$setting_indoors_workingspace) <- "Indoors setting (work)"
label(light_data_ext$scenario_daylight2) <- "Daylight conditions (stratified)"
label(light_data_ext$scenario_electric2) <- "Electric lighting conditions (stratified)"
label(light_data_ext$country) <- "Country"
label(light_data_ext$sex) <- "Sex"

c("social_context", "type", "setting_light", "setting_light2", "country", "sex", 
  "autonomy", "setting_indoors", "setting_location", "setting_outdoors", 
  "setting_mixed", "nighttime", "nonwear_activity", "type.detail", 
  "setting_specific", "setting_indoors_home", "setting_indoors_workingspace", 
  "setting_indoors_healthfacility", "setting_indoors_learningfacility", 
  "setting_indoors_retailfacility", "scenario_daylight", "scenario_electric", 
  "screen_phone", "screen_tablet", "screen_pc", "screen_tv", "behaviour_change", 
  "travel_time_zone", "scenario_daylight2", "scenario_electric2", "wear_activity"
) |> 
  walk(\(x) {
    symbol <- sym(x)
    table <- duration_table(!!symbol)
    table
    gtsave(table, glue("tables/table_duration_{x}.png"), vwidth = 1200)
  })
```

## Time above threshold

In this section we calculate the time above threshold for the single day of `22 September 2025` across latitude and country.

```{r}
#| label: Time above 250 lx mel EDI across participants
TAT250 <- 
  light_data_ext |> 
  filter_Date(start = "2025-09-22", length = "1day") |> 
  fill(latitude, longitude, city, country, .direction = "downup") |> 
  dplyr::summarize(
    duration_above_threshold(
      MEDI, Datetime, threshold = 250, na.rm = TRUE, as.df = TRUE
    ),
    latitude
=unique(latitude),longitude =unique(longitude),city =unique(city),country =unique(country) ) |>mutate(n =n(), .by = country)TAT250_plot <-function(value){TAT250 |>ggplot(aes(x=fct_reorder(Id, duration_above_250), y = duration_above_250))+geom_col(aes(fill = {{ value }})) +scale_fill_viridis_b(labels = \(x) paste0(x, "°"))+scale_y_time(labels = \(x) x |>as_datetime(tz ="UTC") |>format("%H:%M"),expand =FALSE) +theme_minimal() +theme_sub_axis_x(text =element_blank(), line =element_line()) +theme_sub_panel(grid.major.x =element_blank())+labs(x ="Participants", y ="Time above 250lx mel EDI (HH:MM)")}# TAT250_plot(longitude) / (TAT250_plot(latitude) + ylim(c(10*3600, 0))) + plot_layout(axis = "collect", heights = c(1,0.05,1))TAT250_plot(latitude)ggsave("figures/Fig4_TAT250.png", width =5, height =2, scale =1.5)``````{r}#| label: Time above 250 lx mel EDI across countriesTAT250 |>ggplot(aes(y=fct_reorder(country, duration_above_250), x = duration_above_250))+geom_boxplot(aes(col = n)) +scale_color_viridis_b(breaks =c(1, 3, 5, 10, 15))+scale_x_time(labels = \(x) x |>as_datetime(tz ="UTC") |>format("%H:%M")) +theme_minimal() +# theme_sub_axis_x(text = element_blank(), line = element_line()) +theme_sub_panel(grid.major.y =element_blank())+labs(y =NULL, x ="Time above 250lx mel EDI (HH:MM)")ggsave("figures/Fig5_TAT250_country.png", width =4, height =3, scale =1.5)```## Global perspectivesIn this section we look at the global distribution of `melanopic EDI`, `Time above 250lx`, and the `Dose` of light.### Shade dataIn this section we calculate day and night times around the globe. 
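The day/night shading rests on a simple mapping from solar altitude to a darkness value: 0 at or above the horizon, rising linearly to 1 at civil twilight (-6°). A minimal sketch of that mapping (the function name is invented; the analysis computes the same expression inline):

```{r}
#| label: darkness mapping example
# darkness = 0 above the horizon, 1 at or below -6 deg (civil twilight),
# linear in between
darkness <- function(alt_deg) pmin(1, pmax(0, -alt_deg / 6))

darkness(c(10, 0, -3, -6, -20))
# 0.0 0.0 0.5 1.0 1.0
```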
As we want to use this information for a looped visualization, we will set a slightly different cutoff and only collect 48 hours.```{r}#| label: Set parameters for day/night# Time window (UTC)t_start <-ymd_hms("2025-09-21 12:00:00", tz ="UTC")t_end <-ymd_hms("2025-09-23 12:00:00", tz ="UTC")# Step between frames (adjust to taste: "30 mins", "1 hour", etc.)time_step <-"15 mins"# Spatial grid resolution (degrees). 2° keeps things light; 1° looks smoother but is heavier.lon_step <-0.5lat_step <-0.5# Darkness mapping: fully dark at civil twilight (-6°), linearly increasing from 0 to 1 as altitude goes 0 -> -6°.dark_full_altitude <--6# degrees# ---------- build lon/lat grid ----------lons <-seq(-180, 180, by = lon_step)lats <-seq(-90, 90, by = lat_step)grid <-expand.grid(lon = lons, lat = lats) |>as_tibble()# ---------- time sequence ----------times_utc <-seq(t_start, t_end, by = time_step)# ---------- compute sun altitude for each (lon, lat, time) ----------# We'll loop over time slices (efficient enough for this grid size),# compute altitude (in radians from suncalc), convert to degrees,# then map to a "darkness" alpha value.``````{r}#| label: Function to calculate sunpositioncompute_slice <-function(tt) {# build a data frame with the timestamp repeated for each grid point df <-data.frame(date =rep(tt, nrow(grid)),lat = grid$lat,lon = grid$lon )# call suncalc with a data frame instead of separate lat/lon vectors sp <- suncalc::getSunlightPosition(data = df, keep =c("altitude"))# altitude is returned in radians; convert to degrees alt_deg <- sp$altitude *180/ pi# darkness mapping (same as before) darkness <-pmin(1, pmax(0, -alt_deg /6))tibble(lon = grid$lon,lat = grid$lat,time = tt,alt_deg = alt_deg,darkness = darkness )}``````{r}#| label: Compute sunpositions# Create the progress bar oncepb <- progress_bar$new(format ="Computing slices [:bar] :percent eta: :eta",total =length(times_utc), clear =FALSE, width =70)shade_list <-vector("list", length(times_utc))for (i 
in seq_along(times_utc)) {
  # Compute one time slice (without pb$tick() inside)
  shade_list[[i]] <- compute_slice(times_utc[[i]])
  # Now update the bar
  pb$tick()
}

shade_df <- dplyr::bind_rows(shade_list)
```

### Melanopic EDI

#### Light data

```{r}
#| label: Collect mel EDI across lat, lon, and time
light_data_globe <- 
  light_data_ext |> 
  filter_Datetime(DatetimeUTC, start = t_start, end = t_end, tz = "UTC") |> 
  select(Id, DatetimeUTC, latitude, longitude, MEDI) |> 
  dplyr::mutate(lat2 = plyr::round_any(latitude, 12), 
                lon2 = plyr::round_any(longitude, 12)) |> 
  ungroup() |> 
  dplyr::summarize(
    .by = c(DatetimeUTC, lat2, lon2),
    lat = mean(latitude),
    lon = mean(longitude),
    MEDI = mean(MEDI, na.rm = TRUE),
    n = n()
  ) |> 
  rename(time = DatetimeUTC)

light_globe <- st_as_sf(light_data_globe, coords = c("lon", "lat"), crs = 4326)
light_proj <- st_transform(light_globe, crs = "+proj=eqc")
```

#### Plot

```{r}
#| label: Plot mel EDI across the globe
#| warning: false
p <- ggplot() +
  geom_sf(data = world_proj,
          fill = "grey",
          color = NA,
          size = 0.25,
          alpha = 0.5,
          show.legend = FALSE
  ) +
  geom_sf(data = tz_lines,
          colour = "deepskyblue3",
          alpha = 0.75,
          linewidth = 0.1) +
  # tile overlay for night shading (black with varying alpha)
  geom_tile(data = shade_df,
            mapping = aes(x = lon, y = lat, alpha = darkness),
            fill = "black",
            width = lon_step, height = lat_step
  ) +
  geom_sf(data = light_proj,
          aes(size = n, fill = MEDI, colour = MEDI),
          alpha = 0.9,
          shape = 21,
          stroke = 0.2
  ) +
  geom_sf_text(data = locations_proj,
               aes(label = n),
               size = 1.9,
               fontface = 2,
               color = "white",
               alpha = 0.75
  ) +
  scale_alpha(range = c(0, 0.6), limits = c(0, 1), guide = "none") +
  coord_sf(crs = 4326, expand = FALSE, xlim = c(-180, 180), ylim = c(-90, 90)) +
  # coord_sf(crs = 4326, xlim = c(-180, 180), ylim = c(-90, 90)) +
  labs(title = "{format(frame_time, '%Y-%m-%d %H:%M:%S')} (UTC)",
       x = NULL, y = NULL
  ) +
  guides(size = "none") +
  labs(fill = "mel EDI (lx)", colour = "mel EDI (lx)") +
  scale_fill_viridis_b(trans = "symlog", 
                       breaks = c(0, 10^(0:5)),
                       labels = function(x) format(x,
scientific =FALSE, big.mark =" "),limits =c(0, 10^5)) +scale_colour_viridis_b(trans ="symlog", breaks =c(0, 10^(0:5)),labels =function(x) format(x, scientific =FALSE, big.mark =" "),limits =c(0, 10^5)) +scale_size_continuous(range =c(3, 7.5)) +theme_minimal(base_size =13) +theme(panel.grid.major =element_line(size =0.1, colour ="grey85"),panel.grid.minor =element_blank(),plot.title =element_text(face ="bold"),legend.text.align =1,legend.text.position ="left",legend.position ="inside",legend.position.inside =c(0.1,0.35),legend.background =element_rect(fill =alpha("white", alpha =0.25),colour =NA) ) +transition_time(time)# Frame rate and dimensionsfps_out <-12width_px <-1400height_px <-800if(interactive()) {anim <-animate( p,duration =48,fps = fps_out,width = width_px,height = height_px,res =150,renderer =gifski_renderer(loop =TRUE))out_file <-"figures/Fig6_melEDI_global.gif"anim_save(out_file, animation = anim)}```## Export for dashboardIn this section we export key data for the dashboard.```{r}save( data_devices, events_complete, light_data_ext, histogram_plot, light_data_1day, timeline_plot, duration_tibble, duration_table, TAT250,file ="data/cleaned/dashboard.RData")```
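When reading the exported objects back in (for instance, inside the dashboard), loading into a dedicated environment avoids clobbering the global workspace. A small, self-contained sketch of that `load(..., envir = )` pattern, using a temporary file and invented object names; the same call applies to `data/cleaned/dashboard.RData`:

```{r}
#| label: dashboard load example
# save an example object to a temporary .RData file
x <- 1:3
tmp <- tempfile(fileext = ".RData")
save(x, file = tmp)

# load into a separate environment; ls() then shows what was exported
e <- new.env()
load(tmp, envir = e)
ls(e)
# "x"

unlink(tmp)
```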